Multi-Stream Acoustic Modelling Using Raw Real and Imaginary Parts of the Fourier Transform
Authors
Abstract
In this paper, we investigate multi-stream acoustic modelling using the raw real and imaginary parts of the Fourier transform of speech signals. Using the raw magnitude spectrum, or features derived from it, as a proxy for the real and imaginary parts leads to irreversible information loss and suboptimal information fusion. We discuss and quantify the importance of such information in terms of speech quality and intelligibility. In the proposed framework, the real and imaginary parts are treated as two streams of information, pre-processed via separate convolutional networks, and then combined at an optimal level of abstraction, followed by further post-processing via recurrent and fully-connected layers. The optimal level of information fusion in various architectures, the training dynamics in terms of cross-entropy loss, frame classification accuracy and WER, as well as the shape and properties of the filters learned in the first convolutional layer of single- and multi-stream models are analysed. We investigated the effectiveness of the proposed systems in various tasks: TIMIT/NTIMIT (phone recognition), Aurora-4 (noise robustness), WSJ (read speech), AMI (meeting) and TORGO (dysarthric speech). Across all tasks we achieved competitive performance: in Aurora-4, down to 4.6% WER on average; in WSJ, down to 4.6% and 6.2% WERs for Eval-92 and Eval-93; for the Dev/Eval sets of AMI-IHM, down to 23.3%/23.8% WERs; and for AMI-SDM, down to 43.7%/47.6% WERs have been achieved. In TORGO, for dysarthric and typical speech we achieved down to 31.7% and 10.2% WERs, respectively.
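As a rough illustration of the two input streams described in the abstract, the sketch below splits the short-time Fourier transform of a signal into its real and imaginary parts and checks that the pair still determines each windowed frame, whereas the magnitude alone does not. This is a minimal sketch, not the authors' front end: the function name `real_imag_streams`, the 25 ms / 10 ms framing at 16 kHz, the Hamming window and the 512-point FFT are all assumptions chosen for illustration, and the convolutional, recurrent and fully-connected stages are not shown.

```python
import numpy as np

def real_imag_streams(signal, frame_len=400, frame_shift=160, n_fft=512):
    """Split a 1-D signal into windowed frames and return the real and
    imaginary parts of each frame's FFT (hypothetical helper; the framing
    parameters are illustrative assumptions, not values from the paper)."""
    n_frames = 1 + (len(signal) - frame_len) // frame_shift
    # Index matrix: one row of sample indices per frame.
    idx = (np.arange(frame_len)[None, :]
           + frame_shift * np.arange(n_frames)[:, None])
    frames = signal[idx] * np.hamming(frame_len)
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    return spectrum.real, spectrum.imag, frames

# Synthetic stand-in for one second of speech sampled at 16 kHz.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

real, imag, frames = real_imag_streams(signal)

# Both streams together recover every windowed frame.
rebuilt = np.fft.irfft(real + 1j * imag, n=512, axis=1)[:, :400]

# The magnitude alone discards phase, so inverting it does not recover the frames.
magnitude = np.hypot(real, imag)
from_magnitude = np.fft.irfft(magnitude, n=512, axis=1)[:, :400]

print(real.shape, imag.shape)
print(np.allclose(rebuilt, frames), np.allclose(from_magnitude, frames))
```

In a multi-stream model of the kind the abstract outlines, `real` and `imag` would each feed a separate convolutional branch before the fusion point.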
Similar resources
Acoustic Seabed Classification using Fractional Fourier Transform
In this paper we present a time-frequency approach for acoustic seabed classification. Work reported is based on sonar data collected by the Volume Search Sonar (VSS), one of the five sonar systems in the AN/AQS-20. The Volume Search Sonar is a beamformed multibeam sonar system with 27 fore and 27 aft beams, covering almost the entire water volume (from above horizontal, through vertical, back ...
On the Relationship between Using Discourse Markers and the Quality of Expository and Argumentative Academic Writing of Iranian English Majors
The aim of the present study was to investigate the frequency and type of discourse markers used in the argumentative and expository writings of Iranian EFL learners and the differences between these text features in the two essay genres. The study also aimed at examining the influence of the use of discourse markers on the participants’ writing quality. To this end the discourse markers us...
Using the Fourier Transform
We give a quantum algorithm for solving a shifted multiplicative character problem over Z/nZ and finite fields. We show that the algorithm can be interpreted as a matrix factorization or as solving a deconvolution problem and give sufficient conditions for a shift problem to be solved efficiently by our algorithm. We also show that combining the shift problem with the hidden subgroup problem re...
Real Clifford Windowed Fourier Transform
We study the windowed Fourier transform in the framework of Clifford analysis, which we call the Clifford windowed Fourier transform (CWFT). Based on the spectral representation of the Clifford Fourier transform (CFT), we derive several important properties such as shift, modulation, reconstruction formula, orthogonality relation, isometry, and reproducing kernel. We also present an example to ...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2023
ISSN: 2329-9304, 2329-9290
DOI: https://doi.org/10.1109/taslp.2023.3237167